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On the convergence of an efficient algorithm 
for Kullback-Leibler approximation of spectral 

densities 

Augusto Ferrante, Federico Ramponi, and Francesco Ticozzi 

Abstract 

This paper deals with a method for the approximation of a spectral density function among the 
solutions of a generalized moment problem a la Byrnes/Georgiou/Lindquist. The approximation is 
pursued with respect to the Kullback-Leibler pseudo-distance, which gives rise to a convex optimization 
problem. After developing the variational analysis, we discuss the properties of an efficient algorithm for 
the solution of the corresponding dual problem, based on the iteration of a nonlinear map in a bounded 
subset of the dual space. Our main result is the proof of local convergence of the latter, established as 
a consequence of the Central Manifold Theorem. Supported by numerical evidence, we conjecture that, 
in the mentioned bounded set, the convergence is actually global. 

I. Introduction 

During the last decade a broad research program on the interplay between (generalized) 
moment problems and analytic interpolation problems with complexity constraints, robust control, 
approximation and estimation of spectral density functions has been carried over by C. I. Byrnes, 
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T. Georgiou, and A. Lindquist and their co-authors and epigones HO), fl20), 0, 0H, 0, 0, 

eb, 0, 0, an, s, ma, am m, urn m, urn nn, ma, mi, oa, nn, eu, m. 

Moment problems have a long history and have been at the heart of many mathematical and 
engineering problems in the past century, see, e.g., 11341 . and the references therein. Only with 
recent developments of the above-mentioned research program, however, the parametrization of 
solutions in the presence of additional constraints on the complexity have been satisfactorily 
addressed lfT2l . This result, that has been possible thanks to a suitable variational formulation, is 
of key interest in control engineering. In fact, the well-known relation between moment problems 
and Nevanlinna-Pick interpolation problems allows for solutions of Hoc control problems that 
include a bound on the complexity of the controller, which is of paramount practical importance 
0, 0. Similar considerations apply to the covariance extension problem ifTOl . [|26l . Among 
the other applications, we also mention signal and image processing Il22l . [|23l , Il24l . [|25l , and 
Biomedical Engineering [|30l . These applications are based on a spectral estimation procedure 
that hinges on optimal approximation of spectral densities with linear integral constraints that 
may be viewed as constraints on a finite number of moments of the spectrum. The linear integral 
constraints represent a knowledge on the steady- state state covariance of a bank of filters that 
is designed in order to estimate the unknown spectral density $, while the to-be-approximated 
spectral density represents a prior knowledge on $, see Section [IT] for more details. As discussed 
in 0, [|26ll , ||32ll , this optimal approximation leads to a tunable spectral estimation algorithm 
that provides high resolution estimates in prescribed frequency bands even in presence of a short 
record of observed data. An important feature of the above mentioned optimal approximation 
method is that the primal optimization problem can be solved in closed form and, as long as 
the prior spectral density is rational, yields a rational solution with an a priori bound on the 
complexity (McMillan degree). 

The numerical challenge, in practical applications, lays with the dual problem. In fact, the dual 
variable is an Hermitian matrix and, as discussed in ll26ll . the reparametrization in vector form 
may lead to a loss of convexity. Moreover, the dual functional and its gradient tend to infinity at 
the boundary, leading to serious numerical difficulties in practical implementations. Indeed, any 
gradient-based numerical method is severely affected by heavy computational burden, due to a 
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large number of back- stepping iterations. These arise as consequence of the above-mentioned 
behavior of the gradient in the vicinity of the boundary. In order to avoid this computational 
slowdown, a nonlinear matricial iteration has been proposed in ll32l : It exhibits surprisingly good 
performance and it does not involve either back-stepping or the computation and inversion of 
the Hessian. Proof of convergence, however, has so far been a challenging open problem. This 
problem is eventually successfully addressed in this paper: Our main contribution is the proof 
that this iteration is locally asymptotically convergent to the manifold of solutions of the dual 
problem. Moreover, we analyze many other aspects of this matricial iteration and its dynamical 
properties in connection with the dual problem. We finally show that, by resorting to a spectral 
factorization method, the proposed iteration may be implemented in an effective way. In fact, 
each iteration of the algorithm only requires the solutions of a Riccati equation and of a Lyapunov 
equation, for which robust and efficient algorithms are available. 

The paper is organized as follows. In Section [TT] we give a proper mathematical statement of the 
problem, and proceed by recalling some relevant facts from the literature as well as establishing 



some preliminary results. Section III contains our main result: Local convergence of the matricial 
iteration. The proof is rather articulated and involves many different tools from linear and non- 
linear systems theory, including the Central Manifold Theorem. For this reason, this section is 
divided into many subsections that provide a roadmap of the various parts of the proof. As a 
byproduct, we also obtain many relevant results on the iteration and its linearization. In Section 



IV we describe how the proposed iteration may be implemented in an effective numerical way. 



Section [V] illustrates some results obtained from simulations and a conjecture. Final remarks, 



conclusions and future perspective are presented in Section VI 



Notation. We denote by S) n the set of Hermitian matrices of dimension n. Given a complex 
matrix A, A* denotes the transpose conjugate of A, while, for a matrix valued function x( z ) 
in the complex variable z, x*( z ) denotes the analytic continuation of the function that for 
\z\ = 1 equals the transpose conjugate of x( z )- Thus, for a matrix-valued rational function 
X(z) = H(zl - F)- 1 G + J, we have x*( z ) = G*{z- x I - F*)^ 1 !!* + J*. We denote by T the 
unit circle in the complex plane C and by C(T) the set of complex- valued continuous functions 
on T. C+(t) denotes the subset of C(t) whose elements are real- valued positive functions. 
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Elements in C + (t) will be thought of as spectral densities. 

II. Problem formulation and background material 
Consider the rational transfer function 



G(z) = (zl - A) B, A G C nxn , B G C , 

of the system 

x(t + l) = Ax(t)+By(t), 

where A is a stability matrix, i.e. has all its eigenvalues in the open unit disc, and (A, B) is a 
reachable pair. 

The transfer function G models a bank of filters fed by a stationary process y(t) of unknown 
spectral density &(z). We assume that we know (or that we can reliably estimate) the steady-state 
covariance S of the state x of the filter. Based on £ and on an a priori information in the form 
of a prior spectral density *&(z), we want to estimate the spectral density &(z). 

We will consider the Kullback-Leibler index as a measure of the difference between spectral 
densities * and $ in C+(t): 

The above notation, where integration takes place on the unit circle and with respect to the 
normalized Lebesgue measure, is used throughout the whole paper. 
As in ll26ll . we consider the following 

Problem 2.1: (Approximation problem) Let ^ G C+(T), and let S G C nxn satisfy S = S* > 
0. Find $ that solves 

minimize S(^||$) (1) 
over ($gC + (t) f C7$C7* = S 1 . (2) 



Remark 2.1: Notice that in order to guarantee that the Kullback-Leibler pseudo-distance is 
greater than or equal to zero we need that the zeroth-order moment of its arguments is the 
same, i.e. if J \I> = J$ then §( 1 I'||$) > 0. For the minimization problem to make sense as an 
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approximation, we need precisely this condition. In [|26ll it is shown that, when A is singular, 
the zeroth-order moment of all the spectra $ compatible with the constraint J G&G* = E is 
constant, say J $ = a. Without either of the singularity of A or the equality of the zeroth-order 
moments of and all of the $'s, it is not clear at all if § serves as a pseudo-distance, even if 
the minimization problem continues to be valid. That is, it is not clear whether we can speak 
about "approximation" anymore. In view of this consideration, we require from now on that A 
has at least one eigenvalue at the origin, and that \I> is rescaled accordingly in order to obtain 
J ^ = a. (Hence, what we approximate is the "shape" of not \& itself.) 
If A is non-singular, it is still possible to consider a weighted version of the Kullback-Leibler 
pseudo-distance in such a way that the problem maintain the meaning of spectral approximation. 

To simplify the writing, we can, without loss of generality, normalize E and ty. Indeed, if 
S ^ I, it suffices to replace G by G' := Tr^G and (A, B) with (A 1 = E^AE 1 / 2 , B' = 
E -1 / 2 !?) to obtain an equivalent problem where E = /. 



In a similar fashion, if J $ = J $ = a ^ 1 (compare with Remark 2.1 ), let \E' / := /a and 
G' = y/a G. Then, to any solution $ to the moment problem J G&G* = E there corresponds 
a solution $' to the problem f G'&G'* = E, where $' = It is immediate to check that 

§(^ / ||$ / ) =S(#||$)/a, 

which ensures that the positivity of the pseudo-distance is preserved. Therefore, we can assume 
that = 1. 

The first issue one needs to worry about is existence of $ G C + (T) satisfying constraint §2§. 
It has been shown that the following conditions are equivalent 11261 : 

1) The family of $ satisfying constraint ([2]) is nonempty. 

2) there exists H G C lx ™ such that 

I - AA* = BH + H*B*, (3) 

3) the following rank condition holds 

('-' 

rank I = rank I I (4) 



I -AA* 


b y 




f 








= rank 




: 


B* 


o / 




I B" 
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A fourth equivalent condition is based on a linear operator that will play a crucial role in the 
rest of the paper: Let Sj n be the space of Hermitian matrices of dimension n, and consider the 
linear operator 

r: C(T) — > f,„ 

It is clear (recall that £ = /) that there exists $ G C(T) satisfying §2§ if and only if 

/ G Range T. (6) 

Indeed, it has been shown [fT8l that condition ([6]) is necessary and sufficient for the family of 
$ in ([2]) to be nonempty. Thus, condition (|6]) will be a standing assumption for this paper. We 
endow the space S) n of Hermitian matrices with the inner product 

(P,Q) :=tr (PQ). 

The orthogonal complement of Range T (defined with respect to this inner product) has been 
shown in |[T8l to be given by 

Range T 1 - = {X G S) n : 

G*{^)XG{^) = 0, We [0,2tt]} . (7) 

Notice that, in view of (|6]), we have 

tr (X) = (X, J) = 0, VX G Range T ± . 



A. Variational analysis 



To solve Problem 2.1 we consider a matrix Lagrange multiplier A G S) n satisfying G*AG > 
on all of T, and define the Lagrangian functional 

L($,A) := §(^11$) + ^A, J G&G* - l\ 

= S(tt||$) + y G*AG$ -tr(A). (8) 

This functional is easily seen to be strictly convex. Therefore unconstrained minimization of 
L($,A) can be achieved by annihilating the directional derivative D(L(Q, A), 5$) along all 
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directions 5$ £ C(t). In this way we get the following form for the optimal solution as a 
function of the Lagrange multiplier A. 



G*AG 

Now, it is clear that if A D = A* satisfies 

G*A D G > 0, £ T, 

I 



G—-——G 



G*A Q G 



then 



$o := 



G*A Q G 



(9) 

(10a) 
(10b) 

(11) 



is optimal for Problem 2.1 As for many optimization problems, the most delicate issue is 
existence. This issue has been addressed in ll26ll . ifTTIl where the following result has been 
proven. 



Theorem 2.1: There exist matrices A D = A* such that (10) hold. For any such a A D , $ D given 



by (11 ) is the unique solution of the Approximation Problem (2.1 ). 



Remark 2.2: Notice that, since Problem (2.1 ) admits a unique solution, if A D and A' D are two 



matrices satisfying conditions (10), then 



G*A G 



fr^ so that we clearly have G*(A -A' )G 



CK'G 



0, or equivalently, A D — A' D £ Range Y . Conversely, it is clear that if A D satisfies conditions ( 10) 



then (A D + X) also satisfies conditions (10) for any X £ Range T . Thus, the family C Q of all 



solutions of (10) is an affine space that may be parametrized in terms of an arbitrary solution 

A , as 

C = {A + X; Xe (Range T) 1 }. (12) 



Moreover, for any A D satisfying (10), we have tr J G G G*A = tr A D and, using the cyclic 



property of the trace we immediately get: 



tr A r 



By duality theory, a A D satisfying (|T0|) may be computed by maximization of the dual functional 



A ^ inf{L($,A)|$ £ C+(T)}. 
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The latter, in view of the previous discussion, may be explicitly written as: 

a ^ l (g^g' a ) = / ^ l °sG*AG-tr(A) + / 



Consider now the maximization of the dual functional (13) over the set 



C + ■= {A = A*\G*AG > 0,W* E T}. 



Let 



J*(A) :-- 



*logG*AG + tr (A). 



The dual problem is then equivalent to 

minimize {j^(A)|A E £+}■ 



(13) 



(14) 



(15) 



As discussed in the Introduction, the bottleneck of the whole theory and of its numerous 



applications is now the numerical computation of a A D satisfying (10) or, equivalently, solving 
<p~5|). To this aim the following algorithm has been proposed in [|32ll and further discussed in 



ma. 



B. Iterative algorithm 
For A > 0, let 



6(A) := / A 1/2 G 



G*A 1/2 . 



(16) 



G*AG_ 

It has been shown in 021 that is a map from density matrices to density matrices, i.e. if A is a 
positive semi-definite Hermitian matrix with trace equal to 1, then 6(A) has the same properties. 
Density matrices have long been studied in statistical quantum mechanics, representing quantum 
states in the presence of uncertainty ||3~5l , B3TJ. Moreover, maintains positive definiteness, i.e., 
if A > 0, then 0(A) > 0. In addition to this, the following holds: 

Proposition 2.1: The matrix 0(A) has the same rank and, indeed, the same kernel of the 
matrix A. 

Proof: By taking into account that ker(A x / 2 ) = ker(A), it is immediate to check that if 
v E ker [A] then v E ker [0(A)]. Conversely, it is sufficient to prove that 



G 



G*AG 



G* > 0. 



(17) 
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Indeed, if this is the case and v G ker [6(A)] then A 1//2 t> = 0, so that v G ker [A]. To prove ( 17), 
we observe that G ? AG is continuous and strictly positive on T and hence has a positive minimum 
there. It is therefore sufficient to show that J GG* is positive definite. The latter integral is the 
steady-state state covariance of the filter G driven by normalized white noise, i.e., the unique 
solution H of the discrete-time Lyapunov equation H — AEA* = BB*. In view of controllability 
of the pair (A, B), it is clear that S is positive definite. □ 

Consider the sequence {A fc } produced by the following iteration 



A 



k+l 



e(A A 



(18) 



with an arbitrary initial condition A > 0. Notice that, since each A k produced by the previous 
algorithm is positive definite and has trace equal to 1, we also have A k < I \/k > 0. If the 
sequence {A fc } converges to a limit point A > then such a A is a fixed point for the map 6 



in (16) 



A := 



1 k l/2 G 






_g*Kg_ 



G*A 1/2 . 



(19) 



By multiplying the latter by A 1 I 2 on both sides, it is clear that A satisfies (10) and hence 



provides a solution of Problem 2.1 



Notice that, even if all A k are positive definite, it may happen that the sequence {A k } converges 
to a limit point A s which is positive semidefinite but singular. In this case, it is not guaranteed 



that A s satisfy (jlOj). 

We observe that if A D > is a fixed point of 0, then for any Aj_ G (Range T)- 1 , A D + Aj_ is 
also a fixed point of 0, as long as A D + A^ > 0. In fact, in view of Remark |2.2[ we have 

0(A O + A ± ) = (A + A ± ) 1 / 2 /(A + A ± y/ 2 = A + A ± . (20) 

In a wide series of simulations, we have observed that A^ always converges to a limit point. 
In only one case such a limit point was a singular matrix. Also in that case, however, the limit 



point satisfies (10) and hence provides a solution of Problem 2.1 



III. Proof of convergence 
In this section we prove the main contribution of the paper, namely that the intersection 



between the affine family of solutions of (10) and the cone of positive definite matrices is a 
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locally asymptotically stable manifold for the iteration <fT6]). 



A. Existence of a positive definite A D 

Once again, the first issue that must be addressed is an existence result: We have to show that 



£ o+ :={A o = A:>0: A g£ }^ 



(21) 



where C Q is defined in ( 12 ). To this aim we need a preliminary result in the same vein of Lemma 
9 in ||26l . The latter has been established in a slightly different setting and using an abstract 
functional- analytic approach. We will instead use a direct algebraic approach that provides a 
constructive proof. 

Lemma 3.1: If G*A Q G > 0, Ve j,? G T, then there exists a vector C a E C" xl such that 

G* A ft ft* ft ft* ft 



Proof: As shown in Lemma A.l in the Appendix, we can obtain a decomposition 



G*AG = W*W, 



(22) 



where the (right) spectral factor W{z) is given by (66). Denoting by P the stabilizing solution 



of the Riccati equation (67), by using (71 ), W may be explicitly expressed in the form 

W = (B*PB)- 1/2 B*P (A(zl - A)^ 1 + /) B. (23) 
It is immediate to check that A(zl — A)^ 1 + I = z(zl — A)^ 1 so that 

W = z(B*PB)- 1/2 B*P(zI - A)- 1 B. (24) 



and thus 



with 



G*AG = W*W = W*W U 



(25) 



W\ := z~ x W = (B*PB)- l/2 B*P(zI - A)- l B. (26) 

Therefore, the vector C Q indeed exists and may be explicitly expressed as C Q = ((B* PB)~ 1 / 2 B* P)' 
□ 



Theorem 3.1: The set C a+ defined by (21) is nonempty and it is an open convex subset of 
the affine space C Q . 
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Proof: Let A D G C Q (recall that Theorem [2T| guarantees that C Q ^ 0) so that G*A D G > 
0, Ve jl? G T. From Lemma 3.1 we know that this implies the existence of a vector C Q G C nxl 



such that G*A Q G = G*C C* G. 

On the unit circle T, G*A Q G is continuous and positive and hence 

li := mm{G(z)*A G(z) : z G T} > 0. 
Similarly, on the unit circle T, G*G is continuous and hence 

v := max{G(z)*G(z) : zGl} 
is finite. Let e := ^. Clearly, Vz6T, 

G(z)*(^A - el)G(z) = ^G(z)*A G(z) - eG(z)*G(z) 



Hence, exploiting again Lemma 3.1 we conclude that there exists C\ G C such that 



G*(^A - eI)G = G*C 1 C* 1 G. 

Therefore we have 

= G*(^C G * + eI)G + ^G*G C *G - £ G*G 

= G*(^C C* + eI)G + ^G*A D G - eG*G 

= G*( l -C Q C: + eI)G + G*(^A - eI)G 

= G*(^C C* + eI)G + G*C 1 C* 1 G 

= G*( l -C C: +el + C 1 C* 1 )G = G*A 0+ G, 

where A D+ := \C Cl + el + C\C\ is clearly positive definite and hence A D+ G C +. The fact 
that C Q+ is an open convex subset of C a is an immediate consequence of the fact that the cone 
of positive definite matrices is open and convex together with the fact that C is an affine space. 

□ 
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B. Linearization 

Given that C Q+ is non-empty, we can now pick a point A D G £ + and analyze the map in 
a neighborhood of A Q . To this aim we linearize the map 0, namely we compute the directional 
derivative of at A D in the direction specified by an arbitrary Hermitian matrix X: 

C (e ( A a ),.Y): = l, m 9 ' A ° + £X »- e ' A °». 

1 /2 

In order to find an explicit form for this derivative, we first need an expression for D(A , X). 

1) Derivative of the matrix square root: For a given function g : S) n — > Sj n , let us take the 
directional derivative of (g(A)) 2 in the direction X. The chain rule gives: 

D(g(A) 2 ; X) = D(g(A); X) g(A) + g(A) D(g(A); X) 

Now if g(A) = A 1 / 2 , we have ^(A) 2 = A so that clearly D(g(A) 2 ;X) = D(A;X) = X. In 
conclusion, we get that the derivative D(A 1 ^ 2 ;X) is the solution of the following Lyapunov 
equation: 

D(A 1/2 ; X) A 1 ' 2 + A 1/2 D(A 1/2 ; X) = X (28) 



2) Derivative ofQ: Let us take the variation of (16) in a direction X. By applying the chain 
rule we get: 



D(e(A);X) = D(A 1/2 ; X) J 



G*AG 



+AV2 K/iS iX ) AI/2 

D(A 1/2 ; X) J 

_ A 1/2 f G^G* G*XG 1/2 
J G*AG G*AG 



G*AG 



(29) 



We now compute the latter expression at A = A D and take (10b) into account. This yields: 



D(e(A );X) = DiAl'^X)! A 1 ' 2 + A 1 ' 2 I D^J 2 ^) 



v / G^G* G*XG 1/2 30 
G*A n G G*A n G ° ' 
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which, by property ( |28[ ), may be rewritten as 

Let us define the linear map M : Sj n — > fj n as the above derivative: 

The linear map jtft is then the derivative of 6 computed at a given fixed point A D . 

Adopting a system-theoretic approach, we can consider the sequence of increments {X k } 
with X k = A k — A Q and the linear system X k+ i = ^C(X k ) as the linear approximation of the 
nonlinear discrete-time system A^+i = O(Afc) in the neighborhood of its equilibrium point A D . 
If all the eigenvalues of */# lied in the open unit circle, then we could immediately conclude 
that A is asymptotically stable. However, this is not the case. In fact, it is immediate to check 
that 

Jt{X L ) = X x , V X± G Range F 1 (33) 

so that M restricted to Range T 1 - is the identity operator. We thus need a more sophisticated 
analysis. 

C. Properties and spectrum of ^# 

First, notice that ^# maps Sj n in itself but it is not self-adjoint (with respect to the inner 
product defined in S) n by (X, Y) = tr XY). Indeed, given X, Y 6 $) n , it may happen that 

(^(X),Y)^(X,^(Y)) 

so it is not a priori true that the eigenvalues of are real and that the eigenmatrices of 
span the whole space Sj n . 

A second observation is stated in the following result. 
Lemma 3.2: For any X £ S) n , \xM[X) = 0. 
Proof: We have 

t,^(A-)=t r .Y-/^ 



G*A a G G*XG 



iG G* A Q G 
G^G* 

trX-tr / - - X = trX -tr/ X = 

G*A Q G 
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□ 

We are now ready to analyze the spectrum of J% , Let Y be an eigenmatrix of M and a be the 
corresponding eigenvalue, namely Y is a non-zero Hermitian matrix such that ^(Y) = a Y. 



Due to ( |34| ), we have 

atrY = (35) 

Thus, we have the following corollary. 

Corollary 3.1: Let Y be any eigenmatrix of and assume trY ^ 0. Then, the corresponding 
eigenvalue is zero. 

Notice that A D is one such eigenmatrix. Indeed: 

^ a °- a ' /2 /^S^ a;/2 
^- a ' /2 /^ a;/2 

= k - ai' 2 i a; /2 = o. 

Let Y be an eigenmatrix of and a be the corresponding eigenvalue. We want to compute 



bounds for a. In view of Corollary 3.1 we can assume trY = 0. We have: 



or, equivalently, 



aY = Y — A 1 / 2 / G * G * °* YG A 1/2 
' G*A a G G*A a G ° 



'1 - a)Y = A 1 / 2 / G * G * G * YG A 1/2 . 

' G*A GG*A G ° 



1/2 1/2 

Since A D > 0, we can multiply both members by A D on the left side, and by A D Y on the 
right side. This yields 



1 ' ° J G*A D G G*A a G 



which, by taking the trace on both members and exploiting the cyclic property of the trace, 
implies 

We now observe that 



(G*YG) 2 



tr [A-^yA-Vsy] = tr [(A^^YA; 1 / 4 ) 2 ] 
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which is strictly positive because it is the square of the Frobenius norm of the nonzero matrix 
Ao 1 ^ 4 FAo ^ 4 . In conclusion we get 



a 



J * (G*A G) 2 



(36) 



tr [(Ao 1/4 yA- 1/4 )2 

The right-hand side of pB) is clearly real and non-negative. Indeed, it vanishes if and only 
if G(z)*YG(z) is identically zero on T or, equivalently if an only if Y G (Range r)^. The 
following theorem is thus proven. 

Theorem 3.2: All the eigenvalues of the map jtft are real. For any eigenmatrix of jtft that 
is not in (Range T) 1 , the corresponding eigenvalue is strictly smaller than 1. On the space 
(Range r)^, M acts as the identity operator. 

Remark 3.1: The above theorem may be interpreted as follows. We define x := vec(X) as 
the column vector (with n 2 entries) obtained by stacking the columns of X one over the other. 
Let V be a matrix whose columns form a basis for {x = vec(X) : X G (Range T) 1 }. Let W 
be such that 

T:=[V\ W] (37) 
is nonsingular. Let x := T~ l x = T _1 vec(X). Clearly, a; is a coordinate representation of X. 



Theorem |3.2| states that, with respect to these coordinates the linear map M is represented by a 

M 12 



matrix M of dimension n 2 x n 2 with the following structure: M 
is the dimension of (Range T) 1 - and 

a{B) C (-oo,l). 







B 



, where n± 



(38) 



Clearly, since I n± and B have disjoint spectra, we can select W in (37) such that M 12 = 0, i.e 
M ha the structure 

B 

It remains to establish a lower bound for the spectrum of B. 



M 



(39) 
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D. Eigenvalues of ^# are non-negative 

In order to provide a lower bound for the spectrum, we shall consider the linearized map as 
the generator of a continuous-time semigroup evolution, for which a key spectral property will 
be derived. 

1) M is the opposite of a Lindblad generator: We observe that the operator jft may be 
written in the form 



JZ{X)=X- J 



XT/1/2 

A 1 J 2 G———G* 
G*A n G 



X 



G———G*A 1 J 2 
G*A n G 



= X - J LXL* (40) 

where L := A^ 2 G q*£* g G*. It is immediate to check that L*L = G G Jj[ Q G*, so that clearly 
J L*L = I. Therefore, we can write write — jjt{X) as a (generalized) Lindblad generator [|27ll j^] 

— M{X) = J LXL* — ^[XL*L + L*LX] (41) 

2) A continuous-time evolution: We now consider the following continuous-time linear system 

X(t) = -JZ{X{t)) = J LX(t)L* - ^[X(t)L*L + L*LX{t)\. (42) 

with state space being the set of traceless Hermitian matrices (notice that, since tr M[X) = 0, 



evolution (42) is trace preserving). This will be helpful in proving the following 
Theorem 3.3: All the eigenvalues of the map ^# are non-negative. 

Proof: Let a be an eigenvalue of M and Y be the corresponding eigenmatrix, so that the 



state trajectory generated by system (42) with initial condition X(0) = Y is X(t) = e at Y. We 
denote by ||^||i the sum of the absolute values of the eigenvalues of Y, i.e. ||F||i := |A|. 

\ea(Y) 

'it is remarkable to notice that, in the framework of quantum statistical mechanics, it has been shown by Lindblad 1271 that 
any trace-preserving, strongly-continuous semigroup of completely positive maps from density operators to density operators 
has a generator which can be written as the sum of an Hamiltonian (Liouvillian) term and number of terms of the form of the 
integrand in (|4TJ. Such Markov semigroups have long been studied for their relevance to many aspects of quantum theory and 
thermodynamics (2), (4). In this setting, their spectral properties have been investigated from an operator- theoretic standpoint. 
In order to avoid to overburden this paper with an unnecessary and rather technical detour, we choose here to prove the needed 
results by means of a linear algebraic tools. 
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Let Y P > and Y N > be the positive and negative parts of Y defined as follows: Let 
T*YT = D = Dp - D N , where T* = T~ x , D is a diagonal matrix, D P is obtained from 
.D by annihilating the negative entries and D N := — (D — D P ). Define Y P := TD P T* and 
Y N := TD N T*. Define also the orthogonal projection n P := TO P T* (U N := TO N T*), where 
Op (Oat) is the matrix obtained from D P (D N ) by setting to 1 all the non-zero entries. Clearly, 

HYlli = tr [Yp + Ytf], Y = Y P -Y N . (43) 

Moreover, 

Yp = U P Y, Y N = -T1 N Y, Y P Y N = Y N Yp = 0. (44) 



Recall now that, in view of Corollary 3.1 we can assume tr [Y] = 0. Thus, taking into account 



(43 ) and (44), we have = tr [Y] = tr [(U P +U N )Y] so that = tr [X(t)\ = tr [(IL P +IL N )X(t)]. 



Hence 



It is now easy to see that 



tr [U P j t X(t)} = -tr[U N j t X(t)}. 



^(*)l| 1 = ^l|e-^|| 1 = ^tr[e-(r P + ^)] 
= tr [^(e-^ (Tip -T1 N )Y)} 



: tr [(Up - li N )j t X(t)} = 2 tr [H P ^X(t)] 
- 2 tr [-U P J^(X(t))\ 

Jtx [U P [2LX(t)L* - X(t)L*L - L*LX(t))). 



(45) 



Define X P (t) := TL P X(t), and X N (t) := TL N X(t). Notice that 

X P (t) = X P (t)U P = TLpX P (t), (46) 
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and a similar equality holds for X N (t). We also have X P (t) —X N (t) = X(t) so that, by linearity, 



the integrand X of the last member of (45) may be written as X = X P + X N with 



If 



tr [Up[2LX P (t)L* - X P (t)L*L - L*LX P (t))] 

tr [2U P LXp(t)L* - X P (t)L*L - Tl P L*LX P (t))] 

2 tr [L*U P LXp(t) - L*LX P (t))] 

2 tr [L*(U P - I)LX P (t)] 

2 tr [[X P {t)\y 2 L*{U P - T)L[X P {t)]^\ < 0, 



(47) 



where we have used the cyclic property of the trace operator, equality (46), the fact that X P {t) = 
e~ at Yp > and eventually that lip — / < because lip is an orthogonal projection. As for X^, 
we have 



X N := tr [U P [-2LX N {t)L* + X N (t)L*L + L*LX N {t)\] 
= -2 tr [U P LX N (t)L*Up] < 0, 



(48) 



where we have used the fact that H P X^f(t) = 0, the cyclic property of the trace operator, the 
fact that lip = Up and eventually that X]y(t) > 0. In conclusion we have that X < so that 



dt 



\X(t)\\! = J X < 0. On the other hand 



d „ 
> —\\X 



d 

~d~r 



e- Qt y||i 



-ae 



-at I 



y\u- 



(49) 



so that a > (recall that, Y being an eigenmatrix, it is not the zero matrix and hence ||y||i > 0). 
□ 



Theorem 3.3 and (38 ) allow us to conclude that the matrix B in (39 ) is such that a(B) C [0, 1 



and hence it is a discrete-time stability matrix. 



E. Center Manifold theory 



Let us go back to the original non-linear map 0. The iteration (18) may be incrementally 
represented as A fe+1 — A D = 6(A fc ) — A D and, by Taylor series expansion, as 



X k+1 = JZ(X k ) +m(X k 



(50) 
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where we have defined X k := A k — A D and m is the residue function that vanishes with its first 



derivatives at the origin. Moreover, notice that, from (20), (33), and (50), we immediately get 



m(X ± ) = V X± G (Range T)- 



(51) 



Theorem 3.4: The set C Q+ is locally asymptotically stable for 0. 



Moreover, we partition x in the form x 

corresponding to (Range T)- 1 , and x r G C 
is represented by 



x 



Proof: We resort again to the coordinate representation of X introduced in Remark [34 

x 1 - 



where x G C n± is the component of x 



n these coordinates the incremental evolution ( 50 ) 



ry— L /y>— L I -p ( ry*— L ,-yi T ^\ 

x k+l — x k < J \ x k ) ^k) 



(52) 



x k+l ~ B X k + d( x k i 

where, as already discussed, B is a stability matrix. We are now in the setting of Center Manifold 
theory, see lfl4l pages 34-35]. The first and, in general, most difficult step to apply this theory is 
to find a center manifold, i.e. a C 2 function h : C n± — > C Ur that vanishes with its first derivatives 
at the origin, and such that the center manifold equation 



h^x 1 - + fix 1 , h{x L )) = Bh{x L ) + g^x 1 , /i(x x )) 



(53) 



is satisfied. In our situation, however, this equation admits a solution that may be computed very 



easily. In fact, in view of (51 ), it is immediate to check that 



f(x ± ,0) = 0, g(x ± ,0)=0, Vx ± . 



(54) 



Therefore, we may choose as a solution to equation (53) the identically zero function h. The 



asymptotic behavior of trajectories of (52) originating in a neighborhood of the origin is deter- 
mined by the flow on the center manifold whose dynamics is governed by the equation 

u k+1 =u k + f(uk, h{u k )) =u k + f(uk, 0) = u k . (55) 

Clearly, the zero solution of ([55]) is stable and thus, as stated in [|T4l Theorem 8, page 35]: 



1) The zero solution of (52) is stable. 
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2) There exists a solution w fc = u of (55) and two positive constants k and (3 < 1, such that 

14 - u\ < Kf3 k , \x r k \ = \x r k - h(u)\ < Kf3 k . (56) 



In conclusion, if the initial condition of (52) is sufficiently close to the origin, than 



Xk 



X 







i.e. Xk converges to a state representing an element of (Ranger)- 1 . This is equivalent to say that 



for any A D G C Q+ (defined in (21 )) there exists a neighborhood £>(A D ) such that all trajectories 



{A fc } generated by (18) and originating from £>(A D ), converge to C Q+ . Equivalently, C Q+ is 



locally asymptotically stable for 6. □ 

IV. Numerical implementation 
In this section we discuss a numerically efficient implementation of the integral in the iteration 



( 18 ). We show that it may be computed using very robust and reliable linear algebra algorithms. 
We want to compute J G G ? AG G*, where G, ^ and A are given and G*AG is positive on T. To 
this aim, we assume that \& is rational and, as a preliminary step we compute, using standard 
tools, a minimal minimum-phase spectral factor 

Wy(z) = H(zl - F)- 1 G + D 

of ^. We also employ the factorization 

G*AG = W*(z)W{z), 

with 

W := (B*PB)- 1/2 B*PA(zI - A)~ 1 B + (B*PB) 1/2 



derived in Appendix. Thus, we clearly have 



G———G* 



G*AG 



(Gw- l w<s,){Gw- l w^y 



(57) 



(58) 



so that the integral in (58) is the steady-state output covariance of the filter GW~ 1 Wy driven 
by normalized withe noise. Then, let us compute a state space realization of GW~ l W^,. First, 
we observe that: 



W~ l — [I — (B*PB)- 1 B*PA(zI - Z)- l B](B*PB)- 112 , 



(59) 
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where Z, defined in ( [68] ), is a stability matrix. Hence, 

GW- 1 = -(zl - A)- 1 B(B*PB)- 1 B*PA(zI - Z)- 1 x 
B(B*PB)- 1/2 + {zl - A)- 1 B(B*PB)- 1/2 , 



(60) 



Notice that 



B(B*PB)- L B*PA = A-Z={zI-Z)-{zI- A). 



Plugging this expression into ( |60| ) we get 

GW- 1 = -(zl - A)- 1 B(B*PB)- 112 
+{zl - A)- l B(B*PB)- 1/2 
+ {zl - Z)- 1 B(B*PB)- 1 ' 2 
= (zl - Z)- 1 B(B*PB)- 1/2 . (61) 
Eventually, it is now easy to see that GW~ l W^ has the following state space realization 

GW^Wy = [0 | I] (zl - P) 1 G (62) 



with 



F :-- 



F 

B(B*PBY l,2 H Z 



G :-- 



G 



B(B*PB)- 1 / 2 D 

Notice that F is a stability matrix. The following result comes now as a straightforward conclu- 
sion 

Proposition 4.1: Let H be the solution of the following discrete-time Lyapunov equation 



FEF* + GG*. 



(63) 



Then the integral in (58) is the bottom-right block of S, i.e., 



G 



G*AG 



G* = [0 | /] S 







(64) 



In conclusion, for each iteration of the algorithm (18) we only have to compute: the solution 
of an algebraic Riccati equation of order n, the solution of a discrete-time Lyapunov equation of 
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order n + n^ {n^ being the state space dimension of Wy{z)), and the square root of a positive 
definite matrix A. All these operations are accomplished by standard linear algebra algorithms 
that may be implemented by numerically efficient and robust routines. 

V. Evidence from simulations and a convergence conjecture 

In the previous section we have proven a local result. In an extensive campaign of simulations, 
however, we have always observed that the sequence {A fc } converges very fast to a A D in the 
closure C Q+ of C 0+ . Thus, we conjecture that indeed C 0+ is globally asymptotically stable for 0. 
To this extent, we can do a few considerations. The function maps the open of positive 
definite matrices with unitary trace to itself. Even if all fixed points in this open set are clearly 
in £ a+ , we cannot exclude that the sequence {A k } converges to the boundary of i.e. to a 
singular matrix. Indeed, it is easy to see that there is a whole family of singular matrices in the 
boundary of £P + that are fixed points of 0. This is the family of the 1-dimensional orthogonal 
projections. We have conducted some numerical experiments to understand the behavior of the 
map in the neighborhood of 1-dimensional orthogonal projections. We have observed that, 
even if we generate the sequence {A&} by choosing the initial condition arbitrarily close to a 
1-dimensional orthogonal projection LTi, the sequence always converged to C Q+ . For this reason 
we believe that, except for those in C a+ , the 1-dimensional orthogonal projections are unstable 
equilibrium points. A formal proof of this fact should probably adopt a (nonlinear) Lyapunov 
approach. In fact, the derivative of the square root in the neighborhood of a singular matrix (as 
is an orthogonal projection) is infinite and thus a proof based on linearization does not seem 
viable. 

A second remark concerns the values of along the trajectory {A&} generated by iterating 
0. It has been shown in ifTTI that AA fc := A fc+1 — A k is a descent direction for J%. Indeed, 
experimental evidence in numerical simulation is that more is true: J# always decreases along 
trajectories {A fc }. This fact, if proven, would be an important step toward a Lyapunov argument 
for global convergence. 
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VI. Conclusions 

The bottleneck of the spectral approximation techniques based on convex optimization is the 



solution of the dual problem ( |2.1[ ) 



So far, a closed form solution is not available for the general case, at least as long as the 
relative entropy functional is chosen as a spectral pseudo-distance. Thus, an iterative algorithm 
has been proposed in 11321 , IfTTll in order to obtain the dual solution numerically. Necessary 
conditions for such an algorithm to be of interest are clearly its numerical efficiency and, most 
important, its convergence features. 

While the proposed nonlinear iteration exhibits exceedingly good convergence features in all 
the simulation tests, a convergence result was missing. Our main result proves that the iteration 
is locally convergent to the manifold of full-rank solutions for the dual problem. The path to this 
results is quite tortuous yet provides new insights on the dynamics associated to the iteration, 
offering a potential standpoint towards the extension of the result to global convergence. 

We first develop a detailed analysis of the linearized map and of its spectral properties, also by 
resorting to an associated continuous-time evolution which provides us with a lower bound for 
the spectrum. Then the center manifold theorem, along with a desirable property of the nonlinear 
map, let us conclude that the manifold of solution is locally asymptotically stable. 



The iterative algorithm presented in Section II-B is thus proven to be an eligible candidate for 



being the missing piece towards a satisfactory, feasible solution of the spectral approximation 
problem in the general case. 

As we conjectured in the previous section, supported by numerical simulations, we believe 
it should be possible to prove that convergence is almost global, namely that all the stationary 
points that are not in £ + are m fact repulsive. Technical difficulties rule out a linearization 
approach, suggesting a general Lyapunov analysis as the natural pathway to the desired result. 
This indeed represents the most challenging yet compelling direction for further work. 

We wish to thank prof. Christopher Byrnes for the insightful discussions and for the advice 
he gave us during his staying in Padova. 
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Appendix 

In the following we present a factorization result that is repeatedly used in different parts of 
the paper. 

Lemma A.l: Let G(z) = (zl - A)- 1 B with A G C nxn , B G C nxl , and (A, B) a reachable 
pair. Let A G Sj n be such that G*AG > on T. Then, the following factorization holds: 



G*AG = W*W, 



where 



W := (B*PB)- 1/2 B*PA(zI - A)~ 1 B + (B*PB) 1/2 
and P G Sj n is the stabilizing solution of the algebraic Riccati equation 

IT = A*TIA - A*UB(B*TIB)- 1 B*TIA + A, 
so that the spectrum of closed loop matrix 

Z := A - B(B*PB)- 1 B*PA 



(65) 



(66) 



(67) 



(68) 



lays inside the open unit disk. 

Proof: Given G(z) = (zl — A)~ 1 B and an arbitrary II G S) n , the following identity holds 



[G*(z) | 1] 



G(z) 
1 



0. 



(69) 



r A*UA - II A*UB 
B*UA B*UB 

Therefore, we have 

A + A*UA - II A*UB 
B*UA B*UB 

Since G*AG is positive on the whole T, there exists the stabilizing solution P of the algebraic 



G*AG = [G* | 1] 





G 




1 



(70) 



Riccati equation (67). 



Thus if we set II = P, the block matrix on the right hand side of (70) has the the following 
factorization 



A*PB 
B*PB 



(B*PB)~ 1 [B*PA | B*PB] 
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so that 



G*AG = [G*A*PB + B*PB](B*PB)- l [B*PAG + B*PB] 



W*{z)W(z), 



(71) 



with W(z) given by (66). □ 
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